AITopics | lagrangian method

Collaborating Authors

lagrangian method

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

An Empirical Study of Lagrangian Methods in Safe Reinforcement Learning

Spoor, Lindsay, Serra-Gómez, Álvaro, Plaat, Aske, Moerland, Thomas

arXiv.org Artificial IntelligenceOct-21-2025

In safety-critical domains such as robotics, navigation and power systems, constrained optimization problems arise where maximizing performance must be carefully balanced with associated constraints. Safe reinforcement learning provides a framework to address these challenges, with Lagrangian methods being a popular choice. However, the effectiveness of Lagrangian methods crucially depends on the choice of the Lagrange multiplier $λ$, which governs the trade-off between return and constraint cost. A common approach is to update the multiplier automatically during training. Although this is standard in practice, there remains limited empirical evidence on the robustness of an automated update and its influence on overall performance. Therefore, we analyze (i) optimality and (ii) stability of Lagrange multipliers in safe reinforcement learning across a range of tasks. We provide $λ$-profiles that give a complete visualization of the trade-off between return and constraint cost of the optimization problem. These profiles show the highly sensitive nature of $λ$ and moreover confirm the lack of general intuition for choosing the optimal value $λ^*$. Our findings additionally show that automated multiplier updates are able to recover and sometimes even exceed the optimal performance found at $λ^*$ due to the vast difference in their learning trajectories. Furthermore, we show that automated multiplier updates exhibit oscillatory behavior during training, which can be mitigated through PID-controlled updates. However, this method requires careful tuning to achieve consistently better performance across tasks. This highlights the need for further research on stabilizing Lagrangian methods in safe reinforcement learning. The code used to reproduce our results can be found at https://github.com/lindsayspoor/Lagrangian_SafeRL.

machine learning, multiplier, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2510.17564

Country: Europe > Netherlands (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

PIGDreamer: Privileged Information Guided World Models for Safe Partially Observable Reinforcement Learning

Huang, Dongchi, Wang, Jiaqi, Li, Yang, Xia, Chunhe, Zhang, Tianle, Zhang, Kaige

arXiv.org Artificial IntelligenceSep-24-2025

Partial observability presents a significant challenge for Safe Reinforcement Learning (Safe RL), as it impedes the identification of potential risks and rewards. Leveraging specific types of privileged information during training to mitigate the effects of partial observability has yielded notable empirical successes. In this paper, we propose Asymmetric Constrained Partially Observable Markov Decision Processes (ACPOMDPs) to theoretically examine the advantages of incorporating privileged information in Safe RL. Building upon ACPOMDPs, we propose the Privileged Information Guided Dreamer (PIGDreamer), a model-based RL approach that leverages privileged information to enhance the agent's safety and performance through privileged representation alignment and an asymmetric actor-critic structure. Our empirical results demonstrate that PIGDreamer significantly outperforms existing Safe RL methods. Furthermore, compared to alternative privileged RL methods, our approach exhibits enhanced performance, robustness, and efficiency. Codes are available at: https://github.com/hggforget/PIGDreamer.

information, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2508.02159

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

A proximal augmented Lagrangian method for nonconvex optimization with equality and inequality constraints

Adeoye, Adeyemi D., Latafat, Puya, Bemporad, Alberto

arXiv.org Machine LearningSep-4-2025

We propose an inexact proximal augmented Lagrangian method (P-ALM) for nonconvex structured optimization problems. The proposed method features an easily implementable rule not only for updating the penalty parameters, but also for adaptively tuning the proximal term. It allows the penalty parameter to grow rapidly in the early stages to speed up progress, while ameliorating the issue of ill-conditioning in later iterations, a well-known drawback of the traditional approach of linearly increasing the penalty parameters. A key element in our analysis lies in the observation that the augmented Lagrangian can be controlled effectively along the iterates, provided an initial feasible point is available. Our analysis, while simple, provides a new theoretical perspective about P-ALM and, as a by-product, results in similar convergence properties for its non-proximal variant, the classical augmented Lagrangian method (ALM). Numerical experiments, including convex and nonconvex problem instances, demonstrate the effectiveness of our approach.

artificial intelligence, augmented lagrangian method, machine learning, (14 more...)

arXiv.org Machine Learning

2509.02894

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Europe > Italy (0.04)
Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

db29450c3f5e97f97846693611f98c15-AuthorFeedback.pdf

Neural Information Processing SystemsAug-20-2025, 05:16:50 GMT

assumption 3, feasible point, final version, (15 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.49)
Information Technology > Artificial Intelligence > Machine Learning (0.31)

Add feedback

Inexact Moreau Envelope Lagrangian Method for Non-Convex Constrained Optimization under Local Error Bound Conditions on Constraint Functions

Huang, Yankun, Lin, Qihang, Xu, Yangyang

arXiv.org Artificial IntelligenceFeb-27-2025

In this paper, we study the inexact Moreau envelope Lagrangian (iMELa) method for solving smooth non-convex optimization problems over a simple polytope with additional convex inequality constraints. By incorporating a proximal term into the traditional Lagrangian function, the iMELa method approximately solves a convex optimization subproblem over the polyhedral set at each main iteration. Under the assumption of a local error bound condition for subsets of the feasible set defined by subsets of the constraints, we establish that the iMELa method can find an $\epsilon$-Karush-Kuhn-Tucker point with $\tilde O(\epsilon^{-2})$ gradient oracle complexity.

assumption 3, complexity, dist 2, (14 more...)

arXiv.org Artificial Intelligence

2502.19764

Country:

North America > United States > Iowa (0.04)
North America > United States > Arizona (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Physics-Informed Neural Networks with Trust-Region Sequential Quadratic Programming

Cheng, Xiaoran, Na, Sen

arXiv.org Artificial IntelligenceSep-16-2024

Physics-Informed Neural Networks (PINNs) represent a significant advancement in Scientific Machine Learning (SciML), which integrate physical domain knowledge into an empirical loss function as soft constraints and apply existing machine learning methods to train the model. However, recent research has noted that PINNs may fail to learn relatively complex Partial Differential Equations (PDEs). This paper addresses the failure modes of PINNs by introducing a novel, hard-constrained deep learning method -- trust-region Sequential Quadratic Programming (trSQP-PINN). In contrast to directly training the penalized soft-constrained loss as in PINNs, our method performs a linear-quadratic approximation of the hard-constrained loss, while leveraging the soft-constrained loss to adaptively adjust the trust-region radius. We only trust our model approximations and make updates within the trust region, and such an updating manner can overcome the ill-conditioning issue of PINNs. We also address the computational bottleneck of second-order SQP methods by employing quasi-Newton updates for second-order information, and importantly, we introduce a simple pretraining step to further enhance training efficiency of our method. We demonstrate the effectiveness of trSQP-PINN through extensive experiments. Compared to existing hard-constrained methods for PINNs, such as penalty methods and augmented Lagrangian methods, trSQP-PINN significantly improves the accuracy of the learned PDE solutions, achieving up to 1-3 orders of magnitude lower errors. Additionally, our pretraining step is generally effective for other hard-constrained methods, and experiments have shown the robustness of our method against both problem-specific parameters and algorithm tuning parameters.

augmented lagrangian method, equation, lagrangian method, (17 more...)

arXiv.org Artificial Intelligence

2409.10777

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > Pennsylvania > Centre County > State College (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Fast and Scalable Network Slicing by Integrating Deep Learning with Lagrangian Methods

Hu, Tianlun, Liao, Qi, Liu, Qiang, Massaro, Antonio, Carle, Georg

arXiv.org Artificial IntelligenceJan-22-2024

Network slicing is a key technique in 5G and beyond for efficiently supporting diverse services. Many network slicing solutions rely on deep learning to manage complex and high-dimensional resource allocation problems. However, deep learning models suffer limited generalization and adaptability to dynamic slicing configurations. In this paper, we propose a novel framework that integrates constrained optimization methods and deep learning models, resulting in strong generalization and superior approximation capability. Based on the proposed framework, we design a new neural-assisted algorithm to allocate radio resources to slices to maximize the network utility under inter-slice resource constraints. The algorithm exhibits high scalability, accommodating varying numbers of slices and slice configurations with ease. We implement the proposed solution in a system-level network simulator and evaluate its performance extensively by comparing it to state-of-the-art solutions including deep reinforcement learning approaches. The numerical results show that our solution obtains near-optimal quality-of-service satisfaction and promising generalization performance under different network slicing scenarios.

algorithm, requirement, slice configuration, (12 more...)

arXiv.org Artificial Intelligence

2401.11731

Country:

North America > United States > Nebraska > Lancaster County > Lincoln (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (1.00)

Industry:

Telecommunications (0.68)
Information Technology > Networks (0.46)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

TRC: Trust Region Conditional Value at Risk for Safe Reinforcement Learning

Kim, Dohyeong, Oh, Songhwai

arXiv.org Artificial IntelligenceNov-30-2023

As safety is of paramount importance in robotics, reinforcement learning that reflects safety, called safe RL, has been studied extensively. In safe RL, we aim to find a policy which maximizes the desired return while satisfying the defined safety constraints. There are various types of constraints, among which constraints on conditional value at risk (CVaR) effectively lower the probability of failures caused by high costs since CVaR is a conditional expectation obtained above a certain percentile. In this paper, we propose a trust region-based safe RL method with CVaR constraints, called TRC. We first derive the upper bound on CVaR and then approximate the upper bound in a differentiable form in a trust region. Using this approximation, a subproblem to get policy gradients is formulated, and policies are trained by iteratively solving the subproblem. TRC is evaluated through safe navigation tasks in simulations with various robots and a sim-to-real environment with a Jackal robot from Clearpath. Compared to other safe RL methods, the performance is improved by 1.93 times while the constraints are satisfied in all experiments.

constraint, cvar, optimization, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LRA.2022.3141829

2312.00344

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

SafeDreamer: Safe Reinforcement Learning with World Models

Huang, Weidong, Ji, Jiaming, Zhang, Borong, Xia, Chunhe, Yang, Yaodong

arXiv.org Artificial IntelligenceOct-7-2023

The deployment of Reinforcement Learning (RL) in real-world applications is constrained by its failure to satisfy safety criteria. Existing Safe Reinforcement Learning (SafeRL) methods, which rely on cost functions to enforce safety, often fail to achieve zero-cost performance in complex scenarios, especially vision-only tasks. These limitations are primarily due to model inaccuracies and inadequate sample efficiency. The integration of world models has proven effective in mitigating these shortcomings. In this work, we introduce SafeDreamer, a novel algorithm incorporating Lagrangian-based methods into world model planning processes within the superior Dreamer framework. Our method achieves nearly zero-cost performance on various tasks, spanning low-dimensional and vision-only input, within the Safety-Gymnasium benchmark, showcasing its efficacy in balancing performance and safety in RL tasks. Further details and resources are available on the project website: https://sites.google.com/view/safedreamer.

osrp-lag, safedreamer, world model, (14 more...)

arXiv.org Artificial Intelligence

2307.07176

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Automobiles & Trucks (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

A Globally Convergent Gradient-based Bilevel Hyperparameter Optimization Method

Sinha, Ankur, Gunwal, Satender, Kumar, Shivam

arXiv.org Artificial IntelligenceJun-18-2023

Hyperparameter optimization in machine learning is often achieved using naive techniques that only lead to an approximate set of hyperparameters. Although techniques such as Bayesian optimization perform an intelligent search on a given domain of hyperparameters, it does not guarantee an optimal solution. A major drawback of most of these approaches is an exponential increase of their search domain with number of hyperparameters, increasing the computational cost and making the approaches slow. The hyperparameter optimization problem is inherently a bilevel optimization task, and some studies have attempted bilevel solution methodologies for solving this problem. However, these studies assume a unique set of model weights that minimize the training loss, which is generally violated by deep learning architectures. This paper discusses a gradient-based bilevel method addressing these drawbacks for solving the hyperparameter optimization problem. The proposed method can handle continuous hyperparameters for which we have chosen the regularization hyperparameter in our experiments. The method guarantees convergence to the set of optimal hyperparameters that this study has theoretically proven. The idea is based on approximating the lower-level optimal value function using Gaussian process regression. As a result, the bilevel problem is reduced to a single level constrained optimization task that is solved using the augmented Lagrangian method. We have performed an extensive computational study on the MNIST and CIFAR-10 datasets on multi-layer perceptron and LeNet architectures that confirms the efficiency of the proposed method. A comparative study against grid search, random search, Bayesian optimization, and HyberBand method on various hyperparameter problems shows that the proposed algorithm converges with lower computation and leads to models that generalize better on the testing set.

artificial intelligence, machine learning, optimization problem, (20 more...)

arXiv.org Artificial Intelligence

2208.12118

Country:

North America > United States (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback